Abstract. In this paper, we study the interaction of an antisense RNA and its target mRNA,
based on the model introduced by Alkan et al. (Alkan et al., J. Comput. Biol., Vol:267-282,
2006). Our main results are the derivation of the partition function [11] (Chitsaz et al.,
Bioinformatics, to appear, 2009), based on the concept of tight-structure, and the computation of
the base pairing probabilities. This paper contains the folding algorithm rip which computes the
partition function as well as the base pairing probabilities in $O(N^4M^2) + O(N^2M^4)$ time and
$O(N^2M^2)$ space, where $N$, $M$ denote the lengths of the interacting sequences.



RNA-RNA interaction, joint structure, dynamic programming, partition function, base pairing 
probability, loop, RNA secondary structure. 



1. Introduction 



The discovery of small RNAs that bind to their target mRNAs in order to prohibit their translation
and down-regulate the expression levels of the corresponding genes has drawn a lot of attention in
the RNA world [21]. Studies have shown that many RNA-RNA interactions play a significant role
in different cellular processes, such as the pseudouridylation and methylation of rRNA [4],
nucleotide insertion into mRNAs [6], splicing of pre-mRNA [35] and translation control or plasmid
replication control [SI [T^ [TS].



Regulatory RNAs constitute a subclass of the antisense RNA family, encompassing the snRNAs,
gRNAs and snoRNAs that play a role in the context of rRNA modification, RNA editing, mRNA
splicing and plasmid copy-number regulation. In addition, antisense RNAs are synthesized for
studying specific gene functions. Since the first published results on natural antisense RNAs which
regulate gene expression in C. elegans [25] [SH [13] [27], Drosophila [24], and other organisms [31],
the problem of predicting how two nucleic acid strands interact, the so-called RNA-RNA interaction
problem (RIP), has come into focus.

As observed by Alkan et al. [2], the RIP is NP-complete. The actual argument constitutes an
extension of the work of Akutsu [T] derived in the context of single RNA secondary structure
prediction problems with pseudoknots. As in Rivas and Eddy's pseudoknot folding algorithm
[22], the general idea here is to consider specific classes of interactions that can be computed
via dynamic programming routines. There are several other methods that consider somewhat
restricted versions of the RNA-RNA interaction. For instance, one method concatenates the two
interacting sequences and subsequently employs a slightly modified standard secondary structure
folding algorithm. The algorithms RNAcofold [Il[7], pairfold [3] and NUPACK [H] subscribe to
this strategy. However, this approach cannot predict important motifs in RIPs, as for instance
kissing hairpin loops. The concatenation idea has also been employed using the pseudoknot folding
algorithm of Rivas and Eddy [22]. The resulting algorithm, however, still does not generate all
relevant interaction structures [TT] [5B]. An alternative line of thought is to neglect all internal
base-pairings in either strand and to compute the minimum free energy (mfe) secondary structure
for their hybridization under this constraint. For instance, RNAduplex follows this line of thought,
making it formally equivalent to the classic secondary structure folding algorithm of Waterman
[32] [TS], [33] [30]. Furthermore, we have the algorithm RNAup [23] [22], which uses Alkan's model,
allowing for one interaction region having unbranched interactions within any loop. RNAup can
therefore capture single but not multiple kissing hairpins. Finally, there is IntaRNA [8], facilitating
the efficient prediction of bacterial sRNA targets by incorporating target site accessibility and seed
regions.

Alkan et al. [2] derived an mfe algorithm for predicting the joint secondary structure of two
interacting RNA molecules with polynomial time complexity. Here "joint structure", see Fig. [1] for
an example, means that the intramolecular structures of each molecule are pseudoknot-free, the
intermolecular binding pairs are noncrossing and there exist no so-called "zig-zags" (see Section 2
for details). Zig-zags are sometimes referred to as tangles.

Recently, Chitsaz et al. presented a dynamic programming algorithm which computes the
partition function in $O(N^6)$ time. The key point for passing from the mfe folding of Alkan [2] to
the partition function is a unique grammar by which each interaction structure can be generated.
The dynamic programming routine for the partition function of RNA secondary structures is due to
McCaskill [20] and can be outlined as follows: the free energy of a secondary structure $S$ is assumed
to be additive in terms of its loops, $F(S) = \sum_{L \in S} F_L$, where $F_L$ denotes the free energy
of a loop $L$. The additivity of the free energy translates into the multiplicativity of the
contributions to the partition function $Q = \sum_{S} e^{-F(S)/kT}$, where the sum is taken over all
secondary structures $S$ of length $M$. This factorization of terms can be realized by introducing
$Q^b(i,j)$, where the sum is taken over all substructures $S[i,j]$ on the segment $[i,j]$ for which
$(S[i],S[j]) \in S[i,j]$, and $Q(i,j)$ for all configurations on $[i,j]$, irrespective of whether or
not $i,j$ are connected. In particular, we have $Q(1,M) = Q$. Consequently, we arrive at the
recursion, see Fig. [3],

(1.1) $Q(i,j) = 1 + \sum_{i \le h < \ell \le j} Q(i,h-1)\, Q^b(h,\ell).$
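Recursion (1.1) lends itself to memoized evaluation. The following is a minimal Python sketch under a strongly simplified energy model: every arc $(h,\ell)$ is assumed to contribute an independent Boltzmann factor `pair_weight(h, l)`, so that $Q^b(h,\ell)$ is this factor times $Q(h+1,\ell-1)$. This stands in for the loop-based recursion for $Q^b$ and is not the energy model of the paper; the names `partition_function` and `pair_weight` are hypothetical.

```python
from functools import lru_cache


def partition_function(n, pair_weight):
    """Evaluate recursion (1.1) on the segment [1, n].

    pair_weight(h, l) is an assumed Boltzmann factor for the arc (h, l);
    it replaces the loop-based energies of the full model.
    """

    @lru_cache(maxsize=None)
    def q(i, j):
        # Q(i, j): all configurations on [i, j]; segments with at most
        # one vertex admit only the arc-free configuration.
        if i >= j:
            return 1.0
        total = 1.0  # the configuration without any arc
        for h in range(i, j):
            for l in range(h + 1, j + 1):
                # (h, l) is the arc with the largest endpoint l; everything
                # to the right of l is unpaired, [i, h-1] is arbitrary.
                total += q(i, h - 1) * qb(h, l)
        return total

    @lru_cache(maxsize=None)
    def qb(h, l):
        # Q^b(h, l): configurations on [h, l] in which h and l are paired.
        return pair_weight(h, l) * q(h + 1, l - 1)

    return q(1, n)
```

With all weights set to 1 and no minimum hairpin size, $Q$ simply counts noncrossing arc configurations, which gives a quick sanity check of the uniqueness of the decomposition: four vertices admit nine configurations.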

Let us next recall the basic loop-types upon which the partition function and energy parameters
of RNA secondary structures are based:

(1) a hairpin-loop (Ha$(i,j)$) is a pair $((i,j),[i+1,j-1])$, where $(i,j)$ is an arc and
$[i+1,j-1]$ is an interval, i.e. a sequence of consecutive vertices $(i, i+1, \dots, j-1, j)$, having
energy parameter $e^{-G^{\mathrm{Ha}}(i,j)/kT}$,

(2) an interior-loop (Int$(i_1,j_1;i_2,j_2)$) is a sequence
$((i_1,j_1),[i_1+1,i_2-1],(i_2,j_2),[j_2+1,j_1-1])$, where $(i_2,j_2)$ is nested in $(i_1,j_1)$,
having the energy parameter $e^{-G^{\mathrm{Int}}(i_1,j_1;i_2,j_2)/kT}$,

(3) a multi-loop (M$(i_0,j_0)$), see Fig. 2, is a sequence

(1.2) $([i_0,i_1-1], ((i_1,j_1),[i_1+1,j_1-1]), \dots, ((i_t,j_t),[i_t+1,j_t-1]), [j_t+1,j_0])$

having energy parameter $e^{-(\alpha_1+\alpha_2(t+1)+\alpha_3 c_2)/kT}$, where
$\alpha_1,\alpha_2,\alpha_3 \in \mathbb{R}$, $t$ is the number of $R[i_0+1,j_0-1]$-maximal arcs
inside $R[i_0,j_0]$ and $c_2$ is the number of isolated vertices contained in $[i_0,j_0]$.
Based on the above loop-energies, we obtain the following recursion for $Q^b(i,j)$
Figure 2. The standard loop-types for RNA secondary structures: hairpin-loop (top),
interior-loop (middle) and multi-loop (bottom).



'G"'\i,jMM)/kT 
where



ml 



J2 Q''(fc,j>-("^+"^(^-'))/'=^ 

,<£<j 

E Q'"'(*,fc)(Q"(^ + i,j) + ' 

i<e<j 
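The multi-loop energy parameter $e^{-(\alpha_1+\alpha_2(t+1)+\alpha_3 c_2)/kT}$ introduced in (3) above is the only loop term that depends on counts rather than on tabulated loop energies. As a small illustration, the Python sketch below evaluates it directly; the function name and the numerical parameter values in the usage note are hypothetical and are not taken from any published energy model.

```python
import math


def multiloop_boltzmann_factor(t, c2, alpha1, alpha2, alpha3, kT):
    """Boltzmann factor of a multi-loop.

    t  : number of maximal arcs inside the closing arc
    c2 : number of isolated vertices in the loop
    alpha1, alpha2, alpha3 : linear multi-loop parameters
    kT : thermal energy, in the same units as the alphas
    """
    # The closing arc is counted as well, hence the factor (t + 1).
    energy = alpha1 + alpha2 * (t + 1) + alpha3 * c2
    return math.exp(-energy / kT)
```

For instance, `multiloop_boltzmann_factor(2, 3, 1.0, 0.5, 0.1, 0.6)` evaluates the factor of a multi-loop with two inner stems and three unpaired vertices under these assumed parameters. The linearity in `t` and `c2` is what allows the multi-loop contribution to be split across the auxiliary tables of the recursion.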



The key idea in this paper, which eventually leads to the derivation of both the partition function
and the base pairing probabilities, is the concept of a "tight structure", introduced in
Section 2. The tight structure plays a central role in our grammar and is the main tool for obtaining
the base pairing probabilities. This paper includes the folding algorithm rip, which derives the
partition function as well as the base pairing probabilities in $O(N^4M^2) + O(N^2M^4)$ time and
$O(N^2M^2)$ space. The source code of rip is available upon request.

Figure 3. The unique decomposition of secondary structures. Legend: isolated segment; arbitrary
secondary structure; secondary structure with ends bounded; secondary structure with at least one
loop inside.



2. Combinatorics of interaction structures 



In this section we discuss some combinatorial properties of RNA interaction structures. The key 
idea introduced here is that of a tight structure. The main results of this section are: 

• there exist only four "types" of tight structures 

• given a joint structure $J(i,j;h,\ell)$, each interaction bond $(R[i_0],S[j_0]) \in J(i,j;h,\ell)$
is contained in a unique $J(i,j;h,\ell)$-tight structure

• each joint structure uniquely decomposes into a sequence of tight structures and secondary
structure segments

• there exists a unique (but not canonical) decomposition of a tight structure.



Let us begin by making precise what we mean by interaction structures. Suppose we are given
two diagrams [HI [17] [9] [10], $R$ and $S$, of length $N$ and $M$, respectively. Let $R[i]$ and
$S[i]$ denote the vertex $i$ of $R$ and $S$, respectively. We shall assume that $R[1]$ denotes the
5' end of $R$ and $S[1]$ denotes the 3' end of $S$ as RNA sequences. The induced subgraph of $S$
with respect to the subsequence $(S[i],\dots,S[j])$ is denoted by $S[i,j]$. In particular,
$S[i,i] = S[i]$ and $S[i,i-1] = \varnothing$. A complex $C(R,S,I)$ is a graph consisting of $R$,
$S$ and a set $I$ of arcs of the form $(R[i],S[j])$, see Fig. 4. We shall represent a complex $C$
by drawing $R$ on top of $S$ with the $R$-arcs in the upper halfplane, the $S$-arcs in the lower
halfplane and the $I$-arcs vertical. Given a complex $C$, a subcomplex is the subgraph of $C$
induced by $R[i_1,j_1]$ and $S[i_2,j_2]$.




Figure 4. A complex $C$ induced by $R[1,14]$ and $S[1,13]$.

An arc is called interior if its start and endpoint are both contained in either $R$ or $S$, and
exterior otherwise. Let $\prec_1$ be the partial order over the set of interior arcs, given by

(2.1) $(S[i_1],S[j_1]) \prec_1 (S[i_2],S[j_2]) \iff i_2 < i_1 < j_1 < j_2.$

Similarly, let $\prec_2$ denote the partial order over the set of exterior arcs

(2.2) $(S[i_1],R[j_1]) \prec_2 (S[i_2],R[j_2]) \iff i_1 < i_2,\ j_1 < j_2.$

Given an exterior arc $(R[i],S[j])$, an interior arc $(R[i_1],R[j_1])$ is called its $R$-ancestor
if $i_1 < i < j_1$, and $(S[i_2],S[j_2])$ is called its $S$-ancestor if $i_2 < j < j_2$. We call
$(R[i],S[j])$ the descendant of $(R[i_1],R[j_1])$ and $(S[i_2],S[j_2])$, and the sets of
$R$-ancestors and $S$-ancestors of $(R[i],S[j])$ are denoted by $A_R(R[i],S[j])$ and
$A_S(R[i],S[j])$. The $\prec_1$-minimal $R$-ancestor and $S$-ancestor of $(R[i],S[j])$ are called
its $R$-parent and $S$-parent, see Fig. 5. Finally, we call $(R[i_1],R[j_1])$ and
$(S[i_2],S[j_2])$ dependent if they have a common descendant and independent, otherwise.
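The ancestor and parent relations are simple containment tests. The Python sketch below illustrates them for one strand at a time, assuming interior arcs are given as pairs of vertex indices and are noncrossing, so that all arcs covering a vertex are nested; the helper names `ancestors` and `parent` are hypothetical.

```python
def ancestors(vertex, interior_arcs):
    """Interior arcs (a, b) with a < vertex < b, innermost first.

    For noncrossing arcs, all arcs covering a vertex are nested,
    so sorting by span orders them with respect to the partial order.
    """
    covering = [(a, b) for (a, b) in interior_arcs if a < vertex < b]
    return sorted(covering, key=lambda arc: arc[1] - arc[0])


def parent(vertex, interior_arcs):
    """The minimal ancestor of the vertex, or None if there is none."""
    covering = ancestors(vertex, interior_arcs)
    return covering[0] if covering else None
```

Applied to the exterior arc $(R[3],S[4])$ with $R$-arcs $\{(1,6),(2,4)\}$ and $S$-arcs $\{(2,6),(3,5)\}$, i.e. the ancestor sets listed in the caption of Figure 5, `parent(3, r_arcs)` yields `(2, 4)` and `parent(4, s_arcs)` yields `(3, 5)`.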

Suppose $C' = (R',S',I')$ is a subcomplex induced by $R' = R[i_1,j_1]$ and $S' = S[i_2,j_2]$ and
suppose furthermore there exists an exterior arc, $(R[a],S[b])$, with ancestors $(R[i],R[j])$ and
$(S[i'],S[j'])$. The arc $(R[i],R[j])$ is $C'$-subsumed in $(S[i'],S[j'])$ if for any
$(R[k],S[k']) \in I'$ with $i < k < j$ we have $i' < k' < j'$. In case of $C' = C$, we call
$(R[i],R[j])$ simply "subsumed" in $(S[i'],S[j'])$, see Fig. 6. If $(R[i_1],R[j_1])$ is subsumed
in $(S[i_2],S[j_2])$ and vice versa, we call these arcs equivalent.

Figure 5. Ancestors and parents: for the exterior arc $(R[3],S[4])$, we have the following
ancestor sets $A_R(R[3],S[4]) = \{(R[1],R[6]),(R[2],R[4])\}$ and
$A_S(R[3],S[4]) = \{(S[2],S[6]),(S[3],S[5])\}$. In particular, $(R[2],R[4])$ and $(S[3],S[5])$
are the $R$-parent and $S$-parent, respectively.

Figure 6. Subsumed and equivalent arcs: $(R[1],R[8])$ subsumes $(S[1],S[4])$ and
(SfSl.SfS]). Furthermore, $(R[2],R[5])$ is equivalent to $(S[1],S[4])$.



A joint structure, $J(R[i,j];S[h,\ell],I') = J(i,j;h,\ell)$, is a subcomplex of $C(R,S,I)$ with the
following properties, see Fig. 7:

• $R$, $S$ are secondary structures

• there exist no external pseudoknots, i.e. if $(R[i_1],S[j_1]), (R[i_2],S[j_2]) \in I'$ where
$i_1 < i_2$, then $j_1 < j_2$

• there exist no "zig-zags", see Fig. 8, i.e. if $(R[i_1],R[j_1])$ and $(S[i_2],S[j_2])$ are
dependent, then either $(R[i_1],R[j_1])$ is subsumed by $(S[i_2],S[j_2])$ or vice versa.

Figure 7. A joint structure induced by $R[1,24]$ and $S[1,23]$.

In absence of exterior arcs we refer to a joint structure as a secondary structure segment, or
segment for short. We call $S[i_1,j_1]$ maximal if there exists no segment, $S[i,j]$, containing
$S[i_1,j_1]$. We remark that the idea of a joint structure goes back to [5] and has also been
utilized elsewhere. One key idea in our approach is to introduce a specific joint structure, called
a tight structure, which is in some sense a generalization of the loop. It can be viewed as the
transitive closure of a loop with respect to exterior arcs.
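The second and third defining properties of a joint structure can be tested mechanically. The following Python sketch illustrates this; it assumes interior arcs are given as index pairs per strand and exterior arcs as pairs `(k, k2)` meaning $(R[k],S[k'])$, and it does not verify that $R$ and $S$ themselves are secondary structures. All function names are hypothetical.

```python
def covers(arc, v):
    """True if vertex v lies strictly inside the interior arc."""
    return arc[0] < v < arc[1]


def has_external_pseudoknot(exterior):
    """Exterior arcs cross if sorting by R-index does not sort the S-index."""
    ext = sorted(exterior)
    return any(j1 >= j2 for (_, j1), (_, j2) in zip(ext, ext[1:]))


def r_subsumed_in_s(r_arc, s_arc, exterior):
    # Every exterior arc starting under r_arc ends under s_arc.
    return all(covers(s_arc, k2) for (k, k2) in exterior if covers(r_arc, k))


def s_subsumed_in_r(s_arc, r_arc, exterior):
    # Every exterior arc ending under s_arc starts under r_arc.
    return all(covers(r_arc, k) for (k, k2) in exterior if covers(s_arc, k2))


def has_zigzag(r_arcs, s_arcs, exterior):
    """True if two dependent arcs exist, neither subsuming the other."""
    for r_arc in r_arcs:
        for s_arc in s_arcs:
            dependent = any(
                covers(r_arc, k) and covers(s_arc, k2) for (k, k2) in exterior
            )
            if dependent and not (
                r_subsumed_in_s(r_arc, s_arc, exterior)
                or s_subsumed_in_r(s_arc, r_arc, exterior)
            ):
                return True
    return False
```

As a hypothetical example in the spirit of a zig-zag, take the $R$-arc $(1,4)$, the $S$-arc $(2,5)$ and the exterior arcs $(R[2],S[1])$, $(R[3],S[3])$, $(R[5],S[4])$: the two interior arcs share the descendant $(R[3],S[3])$, yet each has a further descendant outside the other, so `has_zigzag` reports a violation even though the exterior arcs are noncrossing.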




Figure 8. A zig-zag, generated by $(R[2],S[1])$, $(R[3],S[3])$ and $(R[5],S[4])$.

Let $J(a,b;c,d)$ be a fixed joint structure. A joint structure $J(i,j;h,\ell) \subset J(a,b;c,d)$
is $J(a,b;c,d)$-tight (or tight in $J(a,b;c,d)$) if:

• there exists at least one exterior arc $(R[i_1],S[j_1])$

• for any $(R[i_1],S[j_1])$, we have

(2.3) $(A_R(R[i_1],S[j_1]) \cup A_S(R[i_1],S[j_1])) \cap J(a,b;c,d) \subset J(i,j;h,\ell)$

• $J(i,j;h,\ell)$ is minimal with respect to $\subset$.



Given a tight structure (tjs), $J(i,j;h,\ell)$, we observe that neither of the vertices $h$ and
$\ell$ is the start or endpoint of a segment. In particular, $h$ and $\ell$ are not isolated. In
combination with the non zig-zag property, we observe that there are only the following four types
of tights, ($\triangledown$), ($\vartriangle$), ($\square$) or ($\circ$), see Fig. 9:

($\triangledown$): $(R[i],R[j]) \in J(i,j;h,\ell)$ and $(S[h],S[\ell]) \notin J(i,j;h,\ell)$
($\vartriangle$): $(S[h],S[\ell]) \in J(i,j;h,\ell)$ and $(R[i],R[j]) \notin J(i,j;h,\ell)$
($\square$): $\{(R[i],R[j]),(S[h],S[\ell])\} \subset J(i,j;h,\ell)$
($\circ$): $\{(R[i],S[h])\} = J(i,j;h,\ell)$ and $i = j$, $h = \ell$, i.e. we have a single interaction.

Let $J_A(i,j;h,\ell)$ denote a tight structure $J(i,j;h,\ell)$ having type $\xi$, where
$\xi \in A \subset \{\triangledown,\vartriangle,\square,\circ\}$.
In particular, $J_{\xi}(i,j;h,\ell)$ is a tight structure $J(i,j;h,\ell)$ of type $\xi$.
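Since the four types are determined by which boundary arcs are present, the type of a given tight structure can be read off directly. The Python sketch below illustrates this; it assumes the input is already known to be tight, and the return labels `"down"`, `"up"`, `"box"` and `"dot"` are hypothetical names for the four types in the order listed above.

```python
def tight_type(i, j, h, l, r_arcs, s_arcs):
    """Classify a tight structure J(i, j; h, l) by its boundary arcs.

    r_arcs, s_arcs: sets of interior arcs of R and S as index pairs.
    Returns None if the boundary matches none of the four types.
    """
    has_r = (i, j) in r_arcs  # is (R[i], R[j]) an arc?
    has_s = (h, l) in s_arcs  # is (S[h], S[l]) an arc?
    if has_r and has_s:
        return "box"
    if has_r:
        return "down"
    if has_s:
        return "up"
    if i == j and h == l:
        return "dot"  # a single exterior arc (R[i], S[h])
    return None
```

A structure bounded by the $R$-arc $(2,7)$ only is classified as `"down"`, while a single exterior arc with $i=j$ and $h=\ell$ gives `"dot"`.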




Figure 9. From left to right: (tjs) of type ($\triangledown$), ($\vartriangle$), ($\square$) and
($\circ$).
Proposition 2.1. Let $J(a,b,c,d)$ be a joint structure, then the following assertions hold:

(a) if $J(i,j;h,\ell)$ is tight in $J(a,b,c,d)$, then $J(i,j;h,\ell)$ has type
$\tau \in \{\triangledown,\vartriangle,\square,\circ\}$

(b) any exterior arc is contained in a unique $J(a,b,c,d)$-(tjs)

(c) $J(a,b,c,d)$ decomposes into a unique sequence of (tjs) and maximal segments.

Suppose we are given two exterior arcs $(R[i_1],S[j_1]), (R[i_2],S[j_2]) \in J(i,j;h,\ell)$. For
the two $J(i,j;h,\ell)$-tight structures $J_T((R[i_1],S[j_1]))$, $J_T((R[i_2],S[j_2]))$ that
contain them, we set

$(R[i_1],S[j_1]) \sim_{J(i,j;h,\ell)} (R[i_2],S[j_2]) \iff J_T((R[i_1],S[j_1])) = J_T((R[i_2],S[j_2])).$

Suppose $J_T(i,j;r,s)$ is a tight structure where $i < a < b < j$ and $r < c < d < s$. A
double-tight structure $J_{DT}(i,j;r,s)$ in $J_T(i,j;r,s)$ is a joint structure $J(i,j;r,s)$ such
that $J(i,j;r,s) \subset J_T(i,j;r,s)$ and

(2.4) $J_{DT}(i,j;r,s) = (J_T(i,a;r,c),\ J(a+1,b-1;c+1,d-1),\ J_T(b,j;d,s))$

where $J_T(i,a;r,c)$ and $J_T(b,j;d,s)$ are $J(a+1,b-1;c+1,d-1)$-tight structures, see Fig. 10.




Figure 11. Decomposing tights: we show how to decompose tights of the types ($\triangledown$),
($\vartriangle$) or ($\square$) via Corollary 2.2, Corollary 2.3 and Corollary 2.4.

Corollary 2.2. Let $J_{\triangledown}(i,j;r,s)$ be a tight structure of type $\triangledown$ and let
$\zeta_1 = (R[h_1],S[\ell_1])$ and $\zeta_2 = (R[h_2],S[\ell_2])$ be the minimal and maximal
exterior arcs in $J(i,j;r,s)$ and $i+1 \le i_1 < j_1 \le j-1$. Then

(2.5) $J(i+1,j-1;r,s) = \begin{cases}
(R[i+1,i_1-1],\ J_{\{\triangledown,\circ\}}(i_1,j_1;r,s),\ R[j_1+1,j-1]) & \text{if } \zeta_1 \sim_{J(i+1,j-1;r,s)} \zeta_2 \\
(R[i+1,i_1-1],\ J_{DT}(i_1,j_1;r,s),\ R[j_1+1,j-1]) & \text{otherwise,}
\end{cases}$

where $J_{\{\triangledown,\circ\}}(i_1,j_1;r,s)$ denotes a $J(i+1,j-1;r,s)$-tight of type
$\triangledown$ or $\circ$.

Of course we have

Corollary 2.3. Let $J_{\vartriangle}(i,j;r,s)$ be a tight structure of type $\vartriangle$ and let
$\zeta_1 = (R[h_1],S[\ell_1])$ and $\zeta_2 = (R[h_2],S[\ell_2])$ be the minimal and maximal
exterior arcs in $J(i,j;r,s)$ and $r+1 \le r_1 < s_1 \le s-1$. Then

(2.6) $J(i,j;r+1,s-1) = \begin{cases}
(S[r+1,r_1-1],\ J_{\{\vartriangle,\circ\}}(i,j;r_1,s_1),\ S[s_1+1,s-1]) & \text{if } \zeta_1 \sim_{J(i,j;r+1,s-1)} \zeta_2 \\
(S[r+1,r_1-1],\ J_{DT}(i,j;r_1,s_1),\ S[s_1+1,s-1]) & \text{otherwise,}
\end{cases}$

where $J_{\{\vartriangle,\circ\}}(i,j;r_1,s_1)$ denotes a $J(i,j;r+1,s-1)$-tight of type
$\vartriangle$ or $\circ$.

Corollary 2.4. Let $J_{\square}(i,j;r,s)$ be a tight structure of type $\square$ and set
$i+1 \le i_1 < j_1 \le j-1$, then $J_{\square}(i,j;r,s)$ decomposes as follows:

(2.7) $J(i+1,j-1;r,s) = (R[i+1,i_1-1],\ J_{\{\vartriangle,\square\}}(i_1,j_1;r,s),\ R[j_1+1,j-1]),$

where $J_{\{\vartriangle,\square\}}(i_1,j_1;r,s)$ denotes a $J(i+1,j-1;r,s)$-tight of type
$\vartriangle$ or $\square$.

2.1. Proofs. Proof of Proposition 2.1.

Proof. Let $(R[i],S[j])$ be the maximal (rightmost) exterior arc of $J(a,b,c,d)$. We consider the
set of maximal $(R[i],S[j])$-ancestors, $\mathcal{M}$. In case of $\mathcal{M} = \varnothing$ we
immediately observe $J(i,j;h,\ell) = (R[i],S[j])$, i.e. $J(i,j;h,\ell)$ is of type $\circ$. Suppose
next $|\mathcal{M}| = 1$. By symmetry we can, without loss of generality, assume
$\mathcal{M} = \{(R[i_1],R[j_1])\}$. Let $(R[i_0],S[j_0])$ be the minimal exterior arc being a
descendant of $(R[i_1],R[j_1])$ and let $j_0^*$ denote either the startpoint of the maximal
$S$-ancestor of $(R[i_0],S[j_0])$ or set $j_0^* = j_0$ if no such ancestor exists. Then, by
construction, $J(i_1,j_1;j_0^*,j)$ is tight in $J(a,b,c,d)$. Finally, consider the case
$|\mathcal{M}| = 2$, i.e. $\mathcal{M} = \{(R[i_1],R[j_1]),(S[r_1],S[s_1])\}$. We may, without
loss of generality, assume that $(R[i_1],R[j_1])$ subsumes $(S[r_1],S[s_1])$. Again we consider the
minimal descendant of $(R[i_1],R[j_1])$, $(R[z],S[x])$. Let $x^*$ be either the startpoint of the
maximal $S$-ancestor of $(R[z],S[x])$ or $x^* = x$, otherwise. Then $J(i_1,j_1;x^*,s_1)$ is tight.
If $(R[i_1],R[j_1])$ is equivalent to $(S[r_1],S[s_1])$, then $J(i_1,j_1;r_1,s_1)$ is tight. In the
above procedure we have constructed a (tjs), $J^*$, of type
$\tau \in \{\triangledown,\vartriangle,\square,\circ\}$ that contains the maximal exterior
$J(a,b,c,d)$-arc. By definition of tight and the fact that we have noncrossing arcs it follows that
any other (tjs) of $J(a,b,c,d)$ is disjoint to $J^*$. We proceed by considering the rightmost
exterior arc of $J(a,b,c,d)$ that is not contained in $J^*$, concluding assertion (c) by induction
on the number of exterior arcs of $J(a,b,c,d)$. Since any exterior arc of $J(a,b,c,d)$ is contained
in a unique (tjs) generated by the above procedure, (b) follows, see Fig. [H □




Proof of Corollary 2.2.

Proof. According to Prop. 2.1 (b), there exist unique $J(i+1,j-1;r,s)$-tight structures
$J(i_1,i_2;r,r_1)$ and $J(j_2,j_1;s_1,s)$ such that $J(i_1,i_2;r,r_1) = J_T(\zeta_1)$ and
$J(j_2,j_1;s_1,s) = J_T(\zeta_2)$, respectively. We have the following two scenarios: in case of
$\zeta_1 \sim_{J(i+1,j-1;r,s)} \zeta_2$, i.e. $J_T(\zeta_1) = J_T(\zeta_2)$, we have either
$r = s$, in which case $J(i_1,j_1;r,s)$ is of type $\circ$, or otherwise, in view of
$(S[r],S[s]) \notin J_{\triangledown}(i,j;r,s)$, $J(i_1,j_1;r,s)$ is of type $\triangledown$. In
case of $\zeta_1 \not\sim_{J(i+1,j-1;r,s)} \zeta_2$, $J(i_1,j_1;r,s)$ is a
$J(i+1,j-1;r,s)$-double tight structure. □



Proof of Corollary 2.4.

Proof. We observe that there exists only one $J(i+1,j-1;r,s)$-tight structure, since
$(S[r],S[s]) \in J(i+1,j-1;r,s)$. We consider the set $\mathcal{M}$, consisting of arcs that are
equivalent to $(S[r],S[s])$. According to Prop. 2.1 (c), we have

$J(i+1,j-1;r,s) = \begin{cases}
(R[i+1,i_1-1],\ J_{\vartriangle}(i_1,j_1;r,s),\ R[j_1+1,j-1]) & \text{for } \mathcal{M} = \varnothing \\
(R[i+1,i_1-1],\ J_{\square}(i_1,j_1;r,s),\ R[j_1+1,j-1]) & \text{otherwise.}
\end{cases}$

□



3. Unique decomposition 



We showed in Section 2 via Prop. 2.1 that an arbitrary joint structure uniquely decomposes into
a sequence of segments and tight structures. Via the combinatorial corollaries, Cor. 2.2, Cor. 2.3
and Cor. 2.4, we introduced a unique decomposition procedure for tights, see Fig. 13 and Fig. 14
below.

Figure 13. Illustration of Cor. lO



In this section we give the algorithmic interpretation of the above results. In the course of our
analysis we derive for any joint structure $J(1,N;1,M)$ a unique decomposition tree via Procedure
(a), (b) and (c) below, see Fig. 15. Here ms abbreviates maximal secondary structure segment. Let us
begin by giving an interpretation of Prop. 2.1.
Procedure (a):

input: a joint structure $\vartheta_0 = J(i,j;h,\ell)$, which is neither $\vartheta_0$-tight nor an ms
output: a unique tree $T_a(\vartheta_0) = (V_a(T),E_a(T))$
Figure 14. Illustration of Cor. ITil

Figure 15. Illustration of Procedure (a), Procedure (b) and Procedure (c) for the joint
structure $J(1,12;1,8)$. From left to right we display $T_a(1,12;1,8)$, $T_b(5,6;6,9)$ and
$T_c(R[7,12])$.



Let $i \le j^* \le j+1$ and let $R[j^*,j]$ be the $\vartheta_0$-ms containing $j$. In particular,
$j^* = j+1$ in case such an ms does not exist and $j^* = i$ if $R[i,j]$ itself is an ms.
Analogously, we define $S[\ell^*,\ell]$. We construct the tree $T_a(\vartheta_0)$ recursively as
follows:
initialization: $V_a(T) = \{\vartheta_0\}$ and $E_a(T) = \varnothing$.

(a1): in case of $j^* = j+1$ and $\ell^* = \ell+1$, i.e. $\vartheta_0$ is right-tight,
$\vartheta_0$ decomposes via Prop. 2.1 (b) and (c) into a $\vartheta_0$-tight structure
$\vartheta_1 = J_{\{\triangledown,\vartriangle,\square,\circ\}}(i_1,j;h_1,\ell)$ and a joint
structure $\vartheta_2 = J(i,i_1-1;h,h_1-1)$, where $i \le i_1 \le j$ and $h \le h_1 \le \ell$.
Accordingly, we have

(3.1) $V_a(T) = V_a(T) \cup \{\vartheta_1,\vartheta_2\},$

(3.2) $E_a(T) = E_a(T) \cup \{(\vartheta_0,\vartheta_1),(\vartheta_0,\vartheta_2)\}.$

(a2): otherwise, $\vartheta_0$ decomposes into a $\vartheta_0$-right-tight structure
$\vartheta_3 = J_{RT}(i,j^*-1;h,\ell^*-1)$ and two ms $\vartheta_4 = R[j^*,j]$,
$\vartheta_5 = S[\ell^*,\ell]$. Accordingly, we have

(3.3) $V_a(T) = V_a(T) \cup \{\vartheta_3,\vartheta_4,\vartheta_5\},$

(3.4) $E_a(T) = E_a(T) \cup \{(\vartheta_0,\vartheta_3),(\vartheta_0,\vartheta_4),(\vartheta_0,\vartheta_5)\}.$

We iterate the process until all the leaves of $T_a(\vartheta_0)$ are either $\vartheta_0$-tight
structures or $\vartheta_0$-ms.



We proceed by providing an interpretation of Cor. 2.2, Cor. 2.3 and Cor. 2.4.
Procedure (b):

input: a tight structure $\vartheta_0 = J(i,j;h,\ell)$
output: a unique tree $T_b(\vartheta_0) = (V_b(T),E_b(T))$
initialization: $V_b(T) = \{\vartheta_0\}$ and $E_b(T) = \varnothing$.
We distinguish $J(i,j;h,\ell)$ by type:
$\circ$: do nothing.

$\square$: according to Cor. 2.4, $\vartheta_0$ decomposes into $\vartheta_1 = (R[i],R[j])$,
$\vartheta_2 = R[i+1,i_1-1]$, $\vartheta_3 = J_{\{\square,\vartriangle\}}(i_1,j_1;h,\ell)$ and
$\vartheta_4 = R[j_1+1,j-1]$, which gives rise to

(3.5) $V_b(T) = V_b(T) \cup \{\vartheta_1,\vartheta_2,\vartheta_3,\vartheta_4\},$

(3.6) $E_b(T) = E_b(T) \cup \{(\vartheta_0,\vartheta_1),(\vartheta_0,\vartheta_2),(\vartheta_0,\vartheta_3),(\vartheta_0,\vartheta_4)\}.$

$\triangledown$: according to Cor. 2.2, we consider the set of $J(i+1,j-1;h,\ell)$-tight
structures, denoted by $\mathcal{M}$, and set $\vartheta_1 = (R[i],R[j])$.
In case of $|\mathcal{M}| = 1$, $J(i+1,j-1;h,\ell)$ decomposes into a sequence consisting of a
$J(i+1,j-1;h,\ell)$-tight structure $\vartheta_6 = J_{\{\triangledown,\circ\}}(i_1,j_1;h,\ell)$
and two $J(i+1,j-1;h,\ell)$-ms, $\vartheta_7 = R[i+1,i_1-1]$ and $\vartheta_8 = R[j_1+1,j-1]$,
where $i < i_1 < j_1 < j$. Accordingly,

(3.7) $V_b(T) = V_b(T) \cup \{\vartheta_1,\vartheta_6,\vartheta_7,\vartheta_8\},$

(3.8) $E_b(T) = E_b(T) \cup \{(\vartheta_0,\vartheta_1),(\vartheta_0,\vartheta_6),(\vartheta_0,\vartheta_7),(\vartheta_0,\vartheta_8)\}.$

In case of $|\mathcal{M}| > 1$, $J(i+1,j-1;h,\ell)$ decomposes into a sequence consisting of a
$J(i+1,j-1;h,\ell)$-double tight structure $\vartheta_9 = J_{DT}(i_1,j_1;h,\ell)$ and two
$J(i+1,j-1;h,\ell)$-ms, $\vartheta_7 = R[i+1,i_1-1]$ and $\vartheta_8 = R[j_1+1,j-1]$, where
$i < i_1 < j_1 < j$. Accordingly,

(3.9) $V_b(T) = V_b(T) \cup \{\vartheta_1,\vartheta_7,\vartheta_8,\vartheta_9\},$

(3.10) $E_b(T) = E_b(T) \cup \{(\vartheta_0,\vartheta_1),(\vartheta_0,\vartheta_7),(\vartheta_0,\vartheta_8),(\vartheta_0,\vartheta_9)\}.$

Furthermore, let $i_1 < i_2 < j_1$ and $h < j_2 < \ell$. The $J(i+1,j-1;h,\ell)$-double tight
structure $\vartheta_9 = J_{DT}(i_1,j_1;h,\ell)$ decomposes into a $J(i+1,j-1;h,\ell)$-tight
structure $\vartheta_{10} = J_{\{\triangledown,\circ,\vartriangle,\square\}}(i_1,i_2;h,j_2)$ and
a $J(i+1,j-1;h,\ell)$-right tight structure $\vartheta_{11} = J_{RT}(i_2+1,j_1;j_2+1,\ell)$, i.e.

(3.11) $V_b(T) = V_b(T) \cup \{\vartheta_{10},\vartheta_{11}\},$

(3.12) $E_b(T) = E_b(T) \cup \{(\vartheta_9,\vartheta_{10}),(\vartheta_9,\vartheta_{11})\}.$

$\vartriangle$: analogous to type $\triangledown$ via symmetry.

In Fig. 17 we give an overview of Procedure (a) and Procedure (b).



Figure 16. (A), (B): maximal secondary segments (ms) $R[i,j]$, $S[r,s]$, (C): joint
structure $J(i,j;r,s)$, (D): right-tight structure $J_{RT}(i,j;r,s)$, (E): double-tight structure
$J_{DT}(i,j;r,s)$, (F): $J_{\{\triangledown,\vartriangle,\square\}}(i,j;r,s)$, a tight structure
of type $\triangledown$, $\vartriangle$ or $\square$, (G): $J_{\square}(i,j;r,s)$,
(H): $J_{\triangledown}(i,j;r,s)$, (J): $J_{\vartriangle}(i,j;r,s)$ and (K): exterior arc.



Finally, we have the well-known [32] secondary structure loop-decomposition.
Procedure (c):

input: a secondary structure $\vartheta_0 = R[i,j]$
output: a tree $T_c(\vartheta_0) = (V_c(T),E_c(T))$
initialization: $V_c(T) = \{\vartheta_0\}$ and $E_c(T) = \varnothing$.
We distinguish the following two cases:

(c1): in case of $(R[i],R[j]) \notin \vartheta_0$, let $\varnothing_i^j$ denote the empty segment
on $[i,j]$, in which all the vertices are isolated. For $i \le j^* \le j+1$, let
$\varnothing_{j^*}^{j}$ be the maximal empty segment that contains $R[j]$. In particular, if $j$ is
not isolated, we have $j^* = j+1$. Let $R^b[i_1,j^*-1]$ denote the segment in which $R[i_1]$ is
connected with $R[j^*-1]$. Then $R[i,j]$ decomposes as follows:
$R[i,j] = (\vartheta_1 = R[i,i_1-1],\ \vartheta_2 = R^b[i_1,j^*-1],\ \vartheta_3 = \varnothing_{j^*}^{j})$
and we set

(3.13) $V_c(T) = V_c(T) \cup \{\vartheta_1,\vartheta_2,\vartheta_3\},$

(3.14) $E_c(T) = E_c(T) \cup \{(\vartheta_0,\vartheta_1),(\vartheta_0,\vartheta_2),(\vartheta_0,\vartheta_3)\}.$

(c2): in case of $(R[i],R[j]) \in \vartheta_0$, i.e. for $\vartheta_0 = R^b[i,j]$, we have a
decomposition into the pair $(\vartheta_4 = (R[i],R[j]),\ \vartheta_5 = R[i+1,j-1])$. Accordingly,
we have $V_c(T) = V_c(T) \cup \{\vartheta_4,\vartheta_5\}$ and
$E_c(T) = E_c(T) \cup \{(\vartheta_0,\vartheta_4),(\vartheta_0,\vartheta_5)\}$.

We iterate (c1) and (c2) until all the leaves in $T$ are either isolated segments or single arcs.

Figure 17. Illustration of Procedure (a) and Procedure (b), notations are given via
Fig. 16 above.
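Procedure (c) translates almost literally into a recursive function. The Python sketch below is one possible reading, assuming the secondary structure is given as a dictionary `partner` mapping each paired vertex to its partner and that all arcs lie inside the initial segment; the node labels `"segment"`, `"bounded"`, `"isolated"` and `"arc"` as well as the function names are hypothetical, and empty intervals are simply omitted from the tree.

```python
def loop_decomposition(i, j, partner):
    """Decomposition tree of R[i, j] as nested tuples (kind, i, j, children)."""
    if i > j:
        return None  # empty interval: contributes no vertex to the tree
    if partner.get(i) == j:
        # (c2): split off the arc (R[i], R[j]) from the interior R[i+1, j-1].
        children = [("arc", i, j, [])]
        inner = loop_decomposition(i + 1, j - 1, partner)
        if inner is not None:
            children.append(inner)
        return ("bounded", i, j, children)
    # (c1): find the maximal run of isolated vertices ending at j.
    j_star = j + 1
    while j_star > i and (j_star - 1) not in partner:
        j_star -= 1
    if j_star == i:
        return ("isolated", i, j, [])  # every vertex of [i, j] is isolated
    i1 = partner[j_star - 1]  # R[i1] is connected with R[j_star - 1]
    parts = [
        loop_decomposition(i, i1 - 1, partner),       # arbitrary left part
        loop_decomposition(i1, j_star - 1, partner),  # bounded segment
        loop_decomposition(j_star, j, partner),       # trailing isolated run
    ]
    return ("segment", i, j, [p for p in parts if p is not None])


def leaves(node):
    """Leaves of the decomposition tree, from left to right."""
    kind, i, j, children = node
    if not children:
        return [(kind, i, j)]
    return [leaf for child in children for leaf in leaves(child)]
```

For the structure with arcs $(1,4)$ and $(2,3)$ on $[1,6]$, the leaves are the two arcs followed by the isolated segment $[5,6]$, in line with the statement that the iteration stops at isolated segments and single arcs.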

For any joint structure, $J(1,N;1,M)$, we can now construct a tree with root $J(1,N;1,M)$ whose
vertices are specific subgraphs of $J(1,N;1,M)$. The latter are obtained by successive
application of Procedure (a), (b) and (c), see Fig. 28. To be precise, let $H$ be the graph rooted
in $J(1,N;1,M)$ defined inductively as follows: for the induction basis, for fixed $J(1,N;1,M)$
only one of Procedure (a), (b) or (c) applies. Procedure (a), (b) or (c) generates the
(procedure-specific, nontrivial) subtrees $T_a$, $T_b$ and $T_c$. Suppose $\vartheta$ is a leaf of
$T$ that has been constructed via Procedure (a), (b) or (c). As in the case of the induction basis,
each such leaf is input for exactly one procedure, which in turn generates a corresponding subtree.
Prop. 2.1, Cor. 2.2, Cor. 2.3 and Cor. 2.4 imply that $H$ itself is a tree. We denote this
decomposition tree by $T(1,N;1,M)$, see Fig. 28. Accordingly, we have proved

Observation 1. For any joint structure, $J(1,N;1,M)$, there exists a unique decomposition tree,
$T(1,N;1,M)$, whose leaves are either interior or exterior $J(1,N;1,M)$-arcs or isolated segments.

As we shall see in Section 5, the decomposition tree plays a key role for the calculation of the
base pairing probabilities. To be precise, given a joint structure, $J(i,j;h,\ell)$, let
$T_J(1,N;1,M)$ be the decomposition tree of $J(1,N;1,M)$ and let
$S_0 = \{J(1,N;1,M) \mid J(i,j;h,\ell) \in T_J(1,N;1,M)\}$.
Then the probability of $J(i,j;h,\ell)$, denoted by $\mathbb{P}(i,j;h,\ell)$, is given by

(3.15) $\mathbb{P}(i,j;h,\ell) = \sum_{J(1,N;1,M) \in S_0} \mathbb{P}_{J(1,N;1,M)}$

and furthermore

Observation 2. In general, $J(i,j;h,\ell) \subset J(1,N;1,M)$ is not equivalent to
$J(i,j;h,\ell) \in T_J(1,N;1,M)$, see Fig. 18. However, in case of secondary structures, i.e.
$J(i,j;h,\ell) = (R[i],R[j])$, we have

(3.16) $(R[i],R[j]) \subset J(1,N;1,M) \iff (R[i],R[j]) \in T_J(1,N;1,M).$

Figure 18. $J(1,4;2,3)$ has the property that $J(1,4;2,3) \subset J(1,4;1,4)$ but
$J(1,4;2,3) \notin T_J(1,4;1,4)$.


4. From the decomposition tree to the partition function 



We discussed in the introduction the concept of the loop-based partition function of RNA secondary
structures due to McCaskill [20]. We observed there that the key property for its derivation is the
unique decomposition into substructures and their recursive analysis. For instance, suppose we
are given a tight of type $\triangledown$ from which we remove, by virtue of Cor. 2.2, its outer
arc. For this purpose, the context of the latter, i.e. its particular arc-configuration, has to be
taken into account. However, once the unique decomposition is established, the existence of specific
subclasses of joint structures allowing for the dynamic programming of the partition function
follows. We remark that the particular choice of the latter may not be unique.

The first step is to extend the standard loop-energy model for secondary structures by introducing
two new loop-types due to Chitsaz et al. [11]: the kissing loop and the hybrid, see Figure 19.



4.1. Loops. Having discussed the standard loop types of secondary structures in Section 1, we
proceed now by introducing the loops that contain exterior arcs.

(4) a hybrid-loop (Hy) is a sequence $((R[i_1],S[j_1]),\dots,(R[i_s],S[j_s]))$, where $s \ge 2$ and
$(R[i_h],S[j_h])$ is nested in $(R[i_{h+1}],S[j_{h+1}])$, such that
$R[i_h+1,i_{h+1}-1] = \varnothing_{i_h+1}^{i_{h+1}-1}$ and
$S[j_h+1,j_{h+1}-1] = \varnothing_{j_h+1}^{j_{h+1}-1}$, i.e. the segments between consecutive
exterior arcs consist of isolated vertices only,

(5) a kissing-loop (K) is either a pair $((R[i],R[j]),R[i+1,j-1])$ such that there exists at least
one $(R[i],R[j])$-child $(R[i_1],S[j_1])$, where $i < i_1 < j$, or a pair
$((S[i],S[j]),S[i+1,j-1])$ with an $(S[i],S[j])$-child $(R[i_1],S[j_1])$ and $i < j_1 < j$.







Figure 19. The two new loop types: the hybrid (top) and the kissing loop (bottom). 
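The hybrid condition can be checked directly from the arc lists. The Python sketch below is a minimal illustration, assuming a hybrid requires at least two exterior arcs, that exterior arcs are given as pairs `(i, j)` for $(R[i],S[j])$, and that `paired_r` and `paired_s` are the sets of vertices incident to any arc (interior or exterior); the function name is hypothetical.

```python
def is_hybrid(exterior, paired_r, paired_s):
    """True if the exterior arcs form a single hybrid.

    Consecutive arcs must be noncrossing and the vertices strictly
    between their endpoints must be isolated on both strands.
    """
    arcs = sorted(exterior)
    if len(arcs) < 2:
        return False
    for (i1, j1), (i2, j2) in zip(arcs, arcs[1:]):
        if not (i1 < i2 and j1 < j2):
            return False  # repeated endpoint or crossing exterior arcs
        if any(v in paired_r for v in range(i1 + 1, i2)):
            return False  # a paired R-vertex interrupts the hybrid
        if any(v in paired_s for v in range(j1 + 1, j2)):
            return False  # a paired S-vertex interrupts the hybrid
    return True
```

For example, the arcs $(R[1],S[1])$, $(R[2],S[2])$, $(R[4],S[3])$ form a hybrid as long as $R[3]$ is isolated; as soon as $R[3]$ carries an arc, the sequence splits and the check fails.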



The arguments of Prop. 2.1, Cor. 2.2, Cor. 2.3 and Cor. 2.4 imply that each joint structure can
uniquely be decomposed into a sequence of loops, a necessary and sufficient condition for the
mfe-folding of joint structures. As we shall see in the next section, the unique decomposition and
the particular choice of loops give rise to specific subclasses via which the partition function can
be recursively expressed. Furthermore, following [7], we allow for an initiation energy, i.e. each
hybrid loop is given an energy penalty of $\sigma_0$. In addition, we allow for a scaling,
$0 \le \sigma \le 1$, of the energy contribution of each hybrid loop. As default we set
$\sigma_0 = 0$, $\sigma = 1$.

4.2. Case studies. Consider a joint structure $J(i,j;h,\ell) \in T(J(1,N;1,M))$. For the purpose of
assigning an energy to a substructure, we have to distinguish substructures by their "outer" loop
type, see Case 1 as well as Fig. 2 and Fig. 19. To convey the key ideas we shall restrict our
analysis to three case studies.




Figure 20. Context dependency: the labels "E, M, F, K" are defined in Case 1. We
display from left to right $J^{E}_{\triangledown}(i,j;h,\ell)$, $J^{M}_{\triangledown}(i,j;h,\ell)$,
$J^{F}_{\triangledown}(i,j;h,\ell)$ and $J^{K}_{\triangledown}(i,j;h,\ell)$, respectively.



Given a joint structure $J(i,j;h,\ell) \in T(J(1,N;1,M))$, we set
$M_R(i,j) = \{(R[i_1],R[j_1]) \mid i_1 < i \le j < j_1\}$ and
$M_S(h,\ell) = \{(S[i_1],S[j_1]) \mid i_1 < h \le \ell < j_1\}$.

Case 1. Suppose we are given a tight structure $J_{\triangledown}(i,j;h,\ell)$. In case of
$M_S(h,\ell) = \varnothing$, we call $S[h,\ell]$ external and use the notation
$J^{E}_{\triangledown}(i,j;h,\ell)$. Otherwise, let $(S[i_0],S[j_0])$ be the minimal element
of $M_S(h,\ell)$. We denote the type of the loop including $(S[i_0],S[j_0])$ by $\xi$. In case of
$\xi = M$, we use the notation $J^{M}_{\triangledown}(i,j;h,\ell)$. Otherwise, in case of
$\xi = K$, there are two subcases, carrying the labels F and K, which are distinguished by whether
or not $J_{\triangledown}(i,j;h,\ell)$ contains the child of $(S[i_0],S[j_0])$, see Fig. 20.

Case 2. Suppose we are given a double-tight structure, $J_{DT}(i,j;h,\ell)$. Then we arrive at the
twelve subclasses presented in Figure 21. Indeed, according to Cor. 2.2 there does not exist
any $J^{E,E}_{DT}(i,j;h,\ell)$, i.e. $M_R(i,j) \cup M_S(h,\ell) \neq \varnothing$. Without loss of
generality, we may assume that $M_R(i,j) \neq \varnothing$ and that
$(R[i_1],R[j_1]) \in M_R(i,j)$ is minimal. In case of $M_S(h,\ell) = \varnothing$, we use the
notation $J^{Y,E}_{DT}(i,j;h,\ell)$, where $Y$ is the loop type formed by $(R[i_1],R[j_1])$ and
$R[i_1+1,j_1-1]$. Otherwise, we have $M_S(h,\ell) \neq \varnothing$. Let $(S[i_2],S[j_2])$ be the
minimal element. In this case we use the notation $J^{Y_1,Y_2}_{DT}(i,j;h,\ell)$, where $Y_1$ and
$Y_2$ are the loop-types formed by $(R[i_1],R[j_1])$, $R[i_1+1,j_1-1]$ and $(S[i_2],S[j_2])$,
$S[i_2+1,j_2-1]$, respectively, see Fig. 21.

Figure 21. The twelve subclasses of $J_{DT}(i,j;h,\ell)$ as discussed in Case 2.

Case 3. In case of a right-tight structure, $J_{RT}(i,j;h,\ell)$, we obtain four subclasses. In case
of $(R[j],S[\ell]) \in J_{RT}(i,j;h,\ell)$, we say $J_{RT}(i,j;h,\ell)$ is (rB) and (rA), otherwise.
Let $(R[i_1],S[j_1])$ denote the minimal exterior arc in $J_{RT}(i,j;h,\ell)$. According to
Prop. 2.1 there exists a unique $J(i,j;h,\ell)$-tight structure $J_T(R[i_1],S[j_1])$ such that
$(R[i_1],S[j_1]) \in J_T(R[i_1],S[j_1])$. In case $J_T(R[i_1],S[j_1])$ is of type $\circ$, i.e. it
is $(R[i_1],S[j_1])$ itself, and $R[i,i_1] = \varnothing_i^{i_1}$,
$S[h,j_1] = \varnothing_h^{j_1}$, we say $J_{RT}(i,j;h,\ell)$ is (lB) and (lA), otherwise. We use
the notation $J^{Y_1,Y_2}_{RT}(i,j;h,\ell)$ if $J_{RT}(i,j;h,\ell)$ is (l$Y_1$) and (r$Y_2$),
respectively, see Fig. 22.



4.3. The partition function. In the previous section we discussed specific subclasses of joint 
structures. They were designed to facilitate the recursive construction of the partition function. 
The purpose of this section is to showcase the respective recursions induced by these classes. 
Case 1: J^{i, j;r, s). According to Cor. 12.21 we have three cases: J^{i, j;r, s) decomposes into 
either a J{i — 1, j + 1; r, s)-tight structure of type k, where k G {sj, 0} or a J{i — 1, j + 1; r, s)-doublc 
tight structure and a ms. By definition of J^{h 3 \ the case of a J(i — 1, j + r, s)-tight structure 
of type o is impossible. Considering the type of the loop including {R[^\^ R[j]) and R[i + 1, j — 1]. 

Figure 22. The four subclasses of $J^{RT}_{i,j;h,\ell}$; see Case 3 for details.



we arrive exactly at the four cases, denoted by $I_1$, $I_2$, $I_3$ and $I_4$ from left to right, displayed in Fig. 23.




Figure 23. The four decompositions of the Case 1 structure $J_{i,j;r,s}$ via Procedure (b), denoted by $I_1$, $I_2$, $I_3$ and $I_4$, from left to right, respectively.



Let $i < h < \ell < j$. According to the recurrences displayed in Fig. 23, the partition function of the Case 1 structures $J_{i,j;r,s}$ satisfies the following recursion:

(4.1) $Q(i,j;r,s) = \sum_{h,\ell} \bigl(Q(I_1) + Q(I_2) + Q(I_3) + Q(I_4)\bigr)$,
where



Qih) 
Qih) 

Qih) 

Qih) 



%i + i,h-i)) 



x(e-(''-'-i)"3/fcT + gm^^ ^ ^ j _ -^^^^ 

Q'i^^ih, I- r, 5)e-(/3i+/32)/feT(g-0-^-l)/33//cT ^ Qk(^ ^ _ 

x(e-(''-»-i)/33/fcT Qk^^ + 1, j - 1)), 

t, r, s)e-("l+2"^)/'=^(g-(^-^-l)"^/'=^Q"^(^ + 1, - 1) 
^g-(/.-»-l)a3ATQm(^ + 1, J - 1)) + Q"(£ + 1, J - l)Q"(i + 1, /i - 1)). 



Case 2: $J^{DT,KM}_{i,j;h,\ell}$. According to Procedure (b), a double-tight structure decomposes into a $J_{i,j;h,\ell}$-tight structure, $J_{i,i_1;h,h_1}$, and a $J_{i,j;h,\ell}$-right-tight structure, $J_{i_1+1,j;h_1+1,\ell}$. We observe that the type of the outer loop of $S[h,h_1]$ and $S[h_1+1,\ell]$ coincides with that of $S[h,\ell]$, i.e. M. Analogously, the outer loop of $R[i,i_1]$ and $R[i_1+1,j]$, denoted by $(R[i_0], R[j_0])$, is of type K. Furthermore, at least one of the substructures $R[i,i_1]$ and $R[i_1+1,j]$ contains the child of $(R[i_0], R[j_0])$. Consequently, we arrive at the three scenarios, labeled from left to right by $J_1$, $J_2$ and $J_3$, displayed in Fig. 24. Setting





Figure 24. The decomposition of $J^{DT,KM}_{i,j;h,\ell}$ via Procedure (b). The corresponding three cases are labeled from left to right by $J_1$, $J_2$ and $J_3$, respectively.



Q^- □(i,zi;r,ji) = g^' (i,ii;r,ji) + g ' («, «i; r,.7i) + (i,ii;r,ji). 



the recursion of the partition function for $J^{DT,KM}_{i,j;h,\ell}$ is given by:

(4.2) $Q^{DT,KM}_{i,j;h,\ell} = \sum_{i_1,j_1} \bigl(Q(J_1) + Q(J_2) + Q(J_3)\bigr)$,
where

Q(-^i) = Q^'^^o('^,ii;r,ji)Q^^{ii + l;ji + l,s) 

Q{J2) = Q^^,n(i,ii;r, + 1; ji + l,s)) 

Q{Jz) = Q^if^^,u^i,H]r,ii)Q^jy^{ii + l-,ji + l,s). 

Case 3: right-tight structures $J^{RT}_{i,j;h,\ell}$ of type (rB). By definition, we have $(R[j], S[\ell]) \in J_{i,j;h,\ell}$. Let $W$ denote the set of exterior arcs in $J_{i,j-1;r,\ell-1}$. If $W = \varnothing$, the structure decomposes into $R[i,j-1]$, $S[h,\ell-1]$ and $(R[j], S[\ell])$. This is the leftmost (first) case ($L_1$) displayed in Fig. 25. Otherwise, let $(R[i_1], S[j_1])$ denote the maximal exterior arc in $J_{i,j-1;r,\ell-1}$. We consider the unique $J_{i,j;r,\ell}$-tight structure that contains $(R[i_1], S[j_1])$, denoted by $J^{T}(R[i_1], S[j_1])$. If $J^{T}(R[i_1], S[j_1])$ is not of type $\circ$, we have the second case ($L_2$) displayed in Fig. 25. Otherwise, depending on whether or not $R[j_1+1, j-1] = \varnothing$ and $S[h_1+1, \ell-1] = \varnothing$, we have the third ($L_3$) and fourth case ($L_4$), displayed in Fig. 25. Consequently, we arrive at:




Figure 25. The four decomposition scenarios of the right-tight structure of Case 3 via Procedure (b). We denote the corresponding cases from left to right by $L_1$, $L_2$, $L_3$ and $L_4$, respectively.



(4.3) Q''dT'''^ihj;hJ)^ ^(Q(Li) + 0(L2) + Q(i3) + Q(i4)), 

where Q(ii) = g^(i, j - l)Q^,{h,£-l) and 

Q(L2) = Qht '''^(^: hi){Q\n + 1, J - 1) + Q\hi + 1,^ - 1) + Q\j, + 1, j - l)Q\h, + l,£ 
Q{L3) = Q^;^'^^^(z,ji;/i,/ii)e-"°+^'"'(^''^^-'"i'''i) 

Q{U) = g^^'''^^(i, ji; h,hi){Q\ji + 1, J - 1) + Q\hi + !,£-!) + Q\,h + l,i - l)Q\hi + 1,£ 
5. Base pairing probabilities

We have seen in Section 3 that the probability of a joint structure $J_{1,N;1,M}$ is given by

(5.1) $P(J_{1,N;1,M}) = \frac{1}{Q}\, e^{-F(J_{1,N;1,M})/kT}$,

where $Q$ denotes the partition function of all joint structures $J_{1,N;1,M}$. In this section, we calculate the base pairing probabilities (BPP) for interior and exterior arcs. The key idea is to associate probabilities with specific substructures contained in the decomposition tree. In other words, a term $Q^{\xi,Y_1Y_2Y_3}_{i,j;h,\ell}$ in the recursive calculation of the partition function gives rise to the probability $P^{\xi,Y_1Y_2Y_3}_{i,j;h,\ell}$. By construction, $P^{\xi,Y_1Y_2Y_3}_{i,j;h,\ell}$ is the sum over all probabilities of joint structures $J_{1,N;1,M}$ such that $J_{i,j;h,\ell}$ is contained in $T(J_{1,N;1,M})$ and $J_{i,j;h,\ell} \in J^{\xi,Y_1Y_2Y_3}_{i,j;h,\ell}$. We remark that the above observations reduce the computation of the BPP to a trace-back routine in the decomposition tree constructed in Section 3.
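
As a small numerical illustration of (5.1), the following Python sketch converts a handful of structure energies into Boltzmann probabilities. The structure labels, the energies, and the value kT = 0.616 kcal/mol (roughly 37 °C) are assumptions chosen for illustration only; they are not taken from rip or from the energy model of this paper.

```python
import math

KT = 0.616  # kcal/mol at about 37 C; an assumed value for illustration


def boltzmann_probabilities(energies, kT=KT):
    """Map {structure: free energy} to ({structure: probability}, Q).

    Each structure receives the weight exp(-F/kT); the partition function Q
    is the sum of all weights, and P = weight / Q.
    """
    weights = {s: math.exp(-f / kT) for s, f in energies.items()}
    Q = sum(weights.values())
    return {s: w / Q for s, w in weights.items()}, Q


# Hypothetical toy ensemble of three joint structures with made-up energies.
toy = {"J_a": -3.0, "J_b": -1.5, "J_c": 0.0}
probs, Q = boltzmann_probabilities(toy)
```

The probabilities sum to one by construction, and a structure with lower free energy always receives the larger probability.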

The basic strategy can be sketched as follows: 

(a) derive from the recursion of the partition function the corresponding recursion of the probabilities

(b) partition the substructures according to their respective contribution to the partition function 

(c) for each subclass, recursively calculate the probability of substructures via tracing back the 
decomposition tree. 

We recall that $\Omega_0 = \{J_{1,N;1,M} \mid J_{i,j;h,\ell} \in T(J_{1,N;1,M})\}$. The probability $P_{i,j;h,\ell}$ is given by

(5.2) $P_{i,j;h,\ell} = \sum_{J_{1,N;1,M} \in \Omega_0} P(J_{1,N;1,M})$.

We accordingly set

(5.3) $P^{\xi,Y_1Y_2Y_3}_{i,j;h,\ell} = \sum_{J_{1,N;1,M} \in \Lambda_0} P(J_{1,N;1,M})$,

where $\Lambda_0 = \{J_{1,N;1,M} \mid J_{i,j;h,\ell} \in T(J_{1,N;1,M}),\ J_{i,j;h,\ell} \in J^{\xi,Y_1Y_2Y_3}_{i,j;h,\ell}\}$.

5.1. Base pairing probabilities for RNA secondary structures. In order to illustrate the concept, let us put the calculation of the BPP for secondary structures into the context of our backtracking routine. Given a secondary structure $R$ of length $N$, the probability of $R$ is given by $P(R) = \frac{1}{Q}\, e^{-F(R)/kT}$. In order to calculate the probability of $R[i]$ being connected to $R[j]$ in the equilibrium ensemble of structures, $P(i_R, j_R)$, the first objective is to express the probability of this base pair as a sum of probabilities of substructures. Let $T(R[1,N])$ be the decomposition tree of a particular secondary structure $R[1,N]$ via Procedure (c) and $\Omega(i_R, j_R) = \{S \mid (R[i], R[j]) \in T(S)\}$. We remark that $\Omega(i_R, j_R)$ coincides with the set of secondary structures in which $R[i]$ is bound to $R[j]$; see Section 3, Observation 2. Then we have

(5.4) $P(i_R, j_R) = \frac{1}{Q} \sum_{S \in \Omega(i_R, j_R)} e^{-F(S)/kT}$.
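
Equation (5.4) can be checked directly on a toy ensemble by enumerating every secondary structure and summing the Boltzmann weights of those that contain a given base pair. The Python sketch below does this under strong simplifying assumptions that are not part of this paper's energy model: each base pair contributes a fixed energy of -1 (in units of kT), only Watson-Crick and GU pairs are allowed, and a hairpin must enclose at least `min_loop` unpaired bases. All function names are hypothetical.

```python
import math

PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(seq, i, j, min_loop=3):
    """True if i < j may pair: complementary bases, >= min_loop bases between."""
    return j - i > min_loop and (seq[i], seq[j]) in PAIRS


def structures(seq, i, j, min_loop=3):
    """Yield every non-crossing structure on seq[i..j] as a tuple of pairs."""
    if j < i:
        yield ()
        return
    # Position j is unpaired.
    yield from structures(seq, i, j - 1, min_loop)
    # Position j pairs with k; the left part and the enclosed part are independent.
    for k in range(i, j):
        if can_pair(seq, k, j, min_loop):
            for left in structures(seq, i, k - 1, min_loop):
                for inner in structures(seq, k + 1, j - 1, min_loop):
                    yield left + ((k, j),) + inner


def pair_probability(seq, i, j, pair_energy=-1.0, kT=1.0, min_loop=3):
    """P(i,j) = (sum of weights of structures containing (i,j)) / Q."""
    Q = 0.0
    with_pair = 0.0
    for s in structures(seq, 0, len(seq) - 1, min_loop):
        weight = math.exp(-pair_energy * len(s) / kT)
        Q += weight
        if (i, j) in s:
            with_pair += weight
    return with_pair / Q


# "GAAAC" admits only the empty structure and the single pair (0, 4).
p = pair_probability("GAAAC", 0, 4)
```

Exhaustive enumeration is exponential in the sequence length, so this is useful only as a reference against which a recursive computation can be validated.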

Let $R^{b}(i,j)$ denote the set of segments $R[i,j]$ in which $R[i]$ is connected with $R[j]$ and $R[i,j] \in T(R[1,N])$. By construction, $P^{b}(i_R, j_R)$ is the probability of $R^{b}(i,j)$. According to Procedure (c), we have $P(i_R, j_R) = P^{b}(i_R, j_R)$, since $(R[i], R[j]) \in T(J_{1,N;1,M})$ if and only if the parent of $(R[i], R[j])$ in the decomposition tree belongs to $R^{b}(i,j)$. Therefore the problem is reduced to the calculation of $P^{b}(i_R, j_R)$. Inspection of Procedure (c) shows that for the parent of an element of $R^{b}(i,j)$ we have to distinguish the five cases displayed in Fig. 26. Let $R^{m}(i,j)$ denote the set



Figure 26. Tracing back: for a parent of $R^{b}(i,j)$ we have, according to Procedure (c), five cases, labeled from top to bottom by $L_1$, $L_2$, $L_3$, $L_4$ and $L_5$. For a parent of $R^{m}(i,j)$ there are two cases, denoted by $M_1$ and $M_2$.

of segments R[i,j] G T{R[\,N]) such that -R[i,j] 7^ 0^, where the outer loop has type M. Let 
R^iiJ) denote the set of segments R[i,j] G T{R[1,N]). In particular, R''{1,N) R[1,N]. Set 
ji?) and P^{iB.,jR) be the probability of R''"-{i, j) and R''{i,j), respectively. Then we have
= P(ii) + P(L2) + P(L3) + P(i4) + P(i5), where 

'(/i,z-l)Q''(i,j) 



p(Li) = 5]r(/i,^)- 



p(i2) = E^p'e^'^)- 



j)g-lnt(ij:/i,f)/feT 

Accordingly, the recurrence formulae for P™(i, j) and P''(i, j) are given as follows: 

P"(^,J) = P(A/i)+P(M2) 
P(Mi) = ^P^(^-l,^)^ <^U + i,/iW KhJ) 



h.i 



fih'h) = ^P'"(i,£)- 



{i,:i)Q\j + ^,h) 
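
The trace-back idea behind the recurrences for the secondary-structure case can be illustrated with a much simpler model. The Python sketch below is not the recursion of this paper: it assumes a Nussinov-style energy (a fixed Boltzmann weight `w` per base pair, no loop-dependent terms), fills the inside tables `Q` and `Qb`, and then obtains each pair probability from the probabilities of its possible parents, processing long spans before short ones. All function names are hypothetical.

```python
import math

PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(seq, i, j, min_loop=3):
    """True if i < j may pair: complementary bases, >= min_loop bases between."""
    return j - i > min_loop and (seq[i], seq[j]) in PAIRS


def q(Q, i, j):
    """Partition function of seq[i..j]; an empty interval has weight 1."""
    return 1.0 if j < i else Q[i][j]


def inside(seq, w, min_loop=3):
    n = len(seq)
    Q = [[0.0] * n for _ in range(n)]   # all structures on [i..j]
    Qb = [[0.0] * n for _ in range(n)]  # structures on [i..j] with (i,j) paired
    for span in range(1, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            if can_pair(seq, i, j, min_loop):
                Qb[i][j] = w * q(Q, i + 1, j - 1)
            total = q(Q, i, j - 1)          # j unpaired
            for k in range(i, j):           # j paired with k
                total += q(Q, i, k - 1) * Qb[k][j]
            Q[i][j] = total
    return Q, Qb


def pair_probabilities(seq, w, min_loop=3):
    """Return (P, Z), where P[k][l] is the probability of the pair (k, l)."""
    n = len(seq)
    Q, Qb = inside(seq, w, min_loop)
    Z = q(Q, 0, n - 1)
    P = [[0.0] * n for _ in range(n)]
    for span in range(n, 1, -1):            # parents before children
        for k in range(n - span + 1):
            l = k + span - 1
            if Qb[k][l] == 0.0:
                continue
            # (k, l) is not enclosed by any other pair.
            p = q(Q, 0, k - 1) * Qb[k][l] * q(Q, l + 1, n - 1) / Z
            # (k, l) is directly enclosed by the parent pair (i, j).
            for i in range(k):
                for j in range(l + 1, n):
                    if Qb[i][j] > 0.0:
                        child = q(Q, i + 1, k - 1) * Qb[k][l] * q(Q, l + 1, j - 1)
                        p += P[i][j] * w * child / Qb[i][j]
            P[k][l] = p
    return P, Z


P, Z = pair_probabilities("GGGAAAUCC", math.exp(1.0))
```

The quadruple loop makes this sketch O(n^4); it is meant to expose the parent-to-child propagation of probabilities, not to be efficient.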



5.2. Base pairing probabilities for joint structures. Following the basic strategy, we first express the BPP via the probabilities of particular substructures. In the following, we abbreviate $J_{1,N;1,M}$ by $J$. In order to calculate $P(i_R, j_R)$, let $\Sigma_1 = \{J \mid (R[i], R[j]) \in J\}$. We consider the parent of $(R[i], R[j])$ in $T(J)$ and accordingly obtain

(5.5) $\Sigma_1 = \{J \mid R^{b}[i,j] \in T(J)\} \cup \bigcup_{h,\ell} \{J \mid J^{\triangledown}_{i,j;h,\ell} \in T(J)\} \cup \bigcup_{h,\ell} \{J \mid J^{\square}_{i,j;h,\ell} \in T(J)\}$,

which immediately leads to

(5.6) $P(i_R, j_R) = P^{b}(i_R, j_R) + \sum_{h,\ell} P^{\triangledown}_{i,j;h,\ell} + \sum_{h,\ell} P^{\square}_{i,j;h,\ell}$,
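where

(5.7) $P^{\triangledown}_{i,j;h,\ell} = P^{\triangledown,E}_{i,j;h,\ell} + P^{\triangledown,M}_{i,j;h,\ell} + P^{\triangledown,K}_{i,j;h,\ell} + P^{\triangledown,F}_{i,j;h,\ell}$,
(5.8) $P^{\square}_{i,j;h,\ell} = P^{\square,E}_{i,j;h,\ell} + P^{\square,M}_{i,j;h,\ell} + P^{\square,K}_{i,j;h,\ell} + P^{\square,F}_{i,j;h,\ell}$.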



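Analogously, for $P(i_S, j_S)$, here evaluated for the arc $(S[h], S[\ell])$, we set

(5.9) $\Sigma_2 = \{J \mid S^{b}[h,\ell] \in T(J)\} \cup \bigcup_{i,j} \{J \mid J^{\triangle}_{i,j;h,\ell} \in T(J)\}$

and obtain

(5.10) $P(h_S, \ell_S) = P^{b}(h_S, \ell_S) + \sum_{i,j} P^{\triangle}_{h,\ell;i,j}$,

where

(5.11) $P^{\triangle}_{h,\ell;i,j} = P^{\triangle,E}_{h,\ell;i,j} + P^{\triangle,M}_{h,\ell;i,j} + P^{\triangle,K}_{h,\ell;i,j} + P^{\triangle,F}_{h,\ell;i,j}$.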



We remark that the expressions for the BPP $P(i_R, j_R)$ and $P(i_S, j_S)$ are not symmetric. This is due to the fact that our decomposition routines always give preference to the outer arcs contained in $R$. In other words, the asymmetry is a result of our particular construction. Finally, we calculate the binding probability of an exterior arc $(R[i], S[j])$. Since $(R[i], S[j])$, being a tight structure of type $\circ$, is already a substructure, we can skip the first two steps of the basic strategy. In order to compute the binding probabilities of both interior and exterior arcs, the key is to employ an "inverse" grammar induced by tracing back in the decomposition tree, as displayed in Fig. 27. By virtue of this backtracking, we obtain the recurrence formulae in analogy to the case of secondary structures discussed above.



In this paper we derive the partition function and the base pairing probabilities of RNA interaction structures. Furthermore, we present the algorithm rip, which computes the partition function and the base pairing probabilities in $O(N^4M^2) + O(N^2M^4)$ time and $O(N^2M^2)$ space.
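
To make the space bound concrete, the Python sketch below counts the entries of a single four-dimensional table indexed by (i, j; h, l) with 1 <= i <= j <= N and 1 <= h <= l <= M. That rip stores its tables in exactly this layout, and that each entry occupies 8 bytes, are assumptions made here for illustration; the count merely shows why one such table already grows like N^2 M^2.

```python
def table_entries(N, M):
    """Number of index tuples (i, j, h, l) with 1 <= i <= j <= N and 1 <= h <= l <= M."""
    return (N * (N + 1) // 2) * (M * (M + 1) // 2)


def table_bytes(N, M, bytes_per_entry=8):
    """Approximate memory for one table; 8 bytes per entry is an assumed size."""
    return table_entries(N, M) * bytes_per_entry


entries = table_entries(100, 100)
size = table_bytes(100, 100)
```

For N = M = 100 this gives 25,502,500 entries, roughly 204 MB per table at 8 bytes per entry, and doubling both lengths multiplies the count by about sixteen.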

While the partition function is due to [11], our construction is independently derived and based on two ideas: the concept of tight structure in Section [1] and the decomposition tree, presented in Section 3. We did, however, adopt the notions of kissing and hybrid loops from [11]. The derivation of the base pairing probabilities for joint structures is new. Here the key idea is to express the latter via energy-wise "quantifiable" substructures that are contained in the decomposition tree. We discussed that, in contrast to the computation of the base pairing probabilities of secondary structures, the specific construction of the unique grammar factors in. As a result, being a joint substructure that contains a certain base pair is no longer the correct criterion. Only those substructures that are obtained via tracing back in the decomposition tree contribute to the base pairing probability.

The complete set of partition function recursions and all details on the particular implementation 
of rip can be found at 

http://www.combinatorics.cn/cbpc/rip.html 



Finally, we also compute the generating function of joint structures. The analysis of this function 
is beyond the scope of this paper and can be found as supplemental material at the above web-site. 
Figure 27. Illustration of the "inverse" grammar, obtained by back-tracing in the decomposition tree.
Figure 28. The decomposition tree $T(J_{1,N;1,M})$.



